Bi-level Score Matching for Learning Energy-based Latent Variable Models
Score matching (SM) provides a compelling approach to learning energy-based models (EBMs) by avoiding the calculation of the partition function. However, learning energy-based latent variable models (EBLVMs) with SM remains largely open, except in some special cases. This paper presents a bi-level score matching (BiSM) method to learn EBLVMs with general structures by reformulating SM as a bi-level optimization problem. The higher level introduces a variational posterior over the latent variables and optimizes a modified SM objective, and the lower level optimizes the variational posterior to fit the true posterior. To solve BiSM efficiently, we develop a stochastic optimization algorithm with gradient unrolling. Theoretically, we analyze the consistency of BiSM and the convergence of the stochastic algorithm. Empirically, we show the promise of BiSM on Gaussian restricted Boltzmann machines and highly nonstructural EBLVMs parameterized by deep convolutional neural networks. BiSM is comparable to the widely adopted contrastive divergence and SM methods when they are applicable, and it can learn complex EBLVMs with intractable posteriors to generate natural images.
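The bi-level structure described in the abstract can be illustrated on a toy problem: an inner loop fits an auxiliary variable by a few gradient steps, and the outer gradient is obtained by differentiating through those unrolled steps. The sketch below is purely illustrative; the scalar objectives, step counts, and learning rates are stand-ins chosen for clarity and are not the paper's actual score-matching losses or hyperparameters.

```python
# Minimal sketch of bi-level optimization with gradient unrolling,
# the structure BiSM uses. The toy objectives below are illustrative
# assumptions, not the paper's actual SM losses.

K = 5          # inner (lower-level) gradient steps to unroll
ETA_IN = 0.2   # inner learning rate
ETA_OUT = 0.3  # outer learning rate
TARGET = 1.0   # toy target for the outer (higher-level) objective

def unrolled_inner(theta, phi0=0.0):
    """Run K inner steps minimizing g(phi; theta) = (phi - theta)^2,
    tracking d(phi_K)/d(theta) through the unrolled updates."""
    phi, dphi_dtheta = phi0, 0.0
    for _ in range(K):
        # inner update: phi <- phi - ETA_IN * dg/dphi, with dg/dphi = 2(phi - theta)
        phi = phi - ETA_IN * 2.0 * (phi - theta)
        # differentiate the update rule w.r.t. theta (chain rule through the unroll)
        dphi_dtheta = (1.0 - 2.0 * ETA_IN) * dphi_dtheta + 2.0 * ETA_IN
    return phi, dphi_dtheta

theta = 0.0
for _ in range(200):
    phi, dphi_dtheta = unrolled_inner(theta)
    # outer loss f(theta) = (phi_K(theta) - TARGET)^2;
    # its gradient flows through the unrolled inner optimization
    grad_theta = 2.0 * (phi - TARGET) * dphi_dtheta
    theta -= ETA_OUT * grad_theta

phi, _ = unrolled_inner(theta)
print(round(phi, 4))  # phi_K is driven to TARGET via the outer updates
```

In BiSM the inner loop plays the role of fitting the variational posterior and the outer loop that of the modified SM objective; in practice automatic differentiation (e.g. in a deep learning framework) replaces the hand-derived chain rule used here.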
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.64)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.60)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)
Review for NeurIPS paper: Bi-level Score Matching for Learning Energy-based Latent Variable Models
Weaknesses: The authors neglect to compare against what are probably the two most closely related works I am aware of. First, the authors briefly mention variational noise contrastive estimation, which can also be used to train models like those presented in this work. While that method has not yet been shown to scale to high-dimensional image data, it should be used as a comparison on the toy data at the very least. Second, "Variational Inference for Sparse and Undirected Models" by Ingraham & Marks provides a method for parameter inference in EBLVMs; it could also serve as a comparison, but at the very least it should be included in the related work. In addition, the proposed method requires two inner-loop optimizations (N x K steps) for each model gradient update.
Review for NeurIPS paper: Bi-level Score Matching for Learning Energy-based Latent Variable Models
All reviewers agree this is interesting work that successfully trains energy-based latent variable models with score matching. There were concerns around clarity of the algorithm, the utility of the latent variables, the complexity of the bi-level optimization process, and missing baselines, all of which should be addressed (as promised in the rebuttal) in the final version of the paper.